Random matrix approach to the dynamics of stock inventory variations
Investors trade stocks using diverse strategies in an attempt to beat the market and earn excess
returns, and their stock inventories change accordingly. The dynamics of inventory variations thus
contain rich information about the trading behaviors of investors and have a crucial influence on price
fluctuations. We study the cross-correlation matrix $C_{ij}$ of inventory variations of the most active
individual and institutional investors in an emerging market to understand the dynamics of inventory
variations. We find that the distribution of the cross-correlation coefficients $C_{ij}$ has a power-law form in
the bulk followed by exponential tails, and that there are more positive coefficients than negative ones. In
addition, two individuals or two institutions are more likely to have strongly correlated inventory
variations than one individual and one institution. We find that the largest and the second largest
eigenvalues ($\lambda_1$ and $\lambda_2$) of the correlation matrix cannot be explained by random matrix theory,
and that the projection of inventory variations on the first eigenvector $u(\lambda_1)$ is linearly correlated with
stock returns, where individual investors play a dominating role. The investors are classified into
three categories based on the cross-correlation coefficients $C_{VR}$ between inventory variations and
stock returns. Half of the individuals are reversing investors who exhibit evident buy and sell herding
behaviors, while 6% of the individuals are trending investors. For institutions, only 10% and 8% of the investors
are trending and reversing investors, respectively. A strong Granger causality is unveiled from stock returns to
inventory variations, which means that a large proportion of individuals hold the reversing trading
strategy and a small part of individuals hold the trending strategy. Compared with the
Spanish market, Chinese investors exhibit both common and market-specific behaviors. Our empirical
findings have scientific significance for the understanding of investors' trading behaviors and for the
construction of agent-based models for stock markets.

PACS numbers: 89.65.Gh, 89.75.Da, 02.10.Yn, 05.45.Tp 



I. INTRODUCTION 

Stock markets are complex systems whose elements
are heterogeneous individual and institutional investors
interacting with each other by stock exchanges [H-[l]- 
Stock prices fluctuate due to investors' trading activities,
and the cross-sectional relation between investors'
stock inventory variations and stock returns has
attracted much attention Q . The extensive literature falls into
three groups: studies of the relation between past returns
and inventory variations, studies of the contemporaneous
relation between inventory variations and stock
returns, and analyses of the return predictability of inventory
variations [5]. The main findings are that institutions are
trending investors adopting the momentum trading strat- 
egy [6|— lS| , while individuals are reversing investors who 
buy previous losers and sell previous winners d, H-[ll| , 
and stock returns lead inventory variations but not vice 
versa (H-Q. 

However, there is evidence showing different trading 
patterns. Lillo et al. investigated the trading behaviors of
about 80 firms that were members of the Spanish Stock
Exchange and found that there were more reversing firms
than trending firms [5]. They also found that the largest
eigenvalue of the correlation matrix of inventory varia- 
tions cannot be explained by the random matrix theory 
and its eigenvector contains information of stock price 
fluctuations. Both buying and selling herding behaviors 
have been observed for trending and reversing firms. 

In this work, we perform an analysis similar to that of Ref. [5],
based on the trading records of Chinese investors on the
Shenzhen Stock Exchange. Unlike the Spanish
case, our data set contains both individual and institutional
investors, which allows us to compare the behaviors
of the two investor types. Our analysis starts from the perspective
of random matrix theory, which has been extensively
used to investigate the cross-correlations of financial
returns in different stock markets [12-14]. However, very
few studies have been conducted on Chinese stocks
[15] and, to our knowledge, no research has been reported
on the dynamics of inventory variations of Chinese
investors. There are, however, studies of Chinese equities
at the transaction and trader level from the complex
network perspective [16-19].

This paper is organized as follows. Section II describes
the data and the method used to construct the time
series of investors' inventory variations. Section III studies
the statistical properties of the elements, eigenvalues
and eigenvectors of the correlation matrix of inventory
variations. Section IV investigates the contemporaneous
and lagged cross-correlations between investors' inventory
variations and stock returns, divides investors into three
categories, and examines their herding behaviors. Section V
summarizes our findings.



II. DATA 

We analyze 39 stocks actively traded on the Shenzhen
Stock Exchange in 2003. The database contains all the
information needed for the analysis in this work. For
each transaction $i$ of a given stock, the data record the
identities of the buyer and seller, the types (individual
or institution) of the two traders, the price $p_i$ and the
size $q_i$ of the trade, and the time stamp. Therefore, the
trading history of each investor is known. For each stock,
we identify active traders who had more than 150 transactions,
amounting to about three transactions per week.
If the number of active traders of a stock is less than
120, we exclude it from the analysis. This leaves 15
stocks for analysis.

Following Ref. [5], we investigate the dynamics of the
inventory variation of the most active investors who
executed more than 120 transactions for each stock. The
trading period of each day consists of a call
auction and a continuous auction, which behave differently
in many aspects and are usually studied separately
[20, 21]. We stress that all the transactions in both the call
auction and the continuous auction are included in our
investigation. The daily inventory variation of an investor
$i$ trading a given stock on day $t$ is defined as follows:

$$ v_i(t) = \sum^{+} p_i(t)\, q_i(t) - \sum^{-} p_i(t)\, q_i(t), \tag{1} $$

where $\sum^{+} p_i(t) q_i(t)$ is the total buy quantity on trading
day $t$ and $\sum^{-} p_i(t) q_i(t)$ is the total sell quantity on
the same day. The basic statistics of the 80 most active
traders and the resultant inventory variations are given
in Table I.
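
As an illustration of Eq. (1), the following minimal sketch accumulates the daily inventory variation of each investor from a list of trade records. The record layout `(day, buyer, seller, price, size)`, the investor labels and the function name are assumptions made for this example only; they are not the format of the Shenzhen Stock Exchange database.

```python
from collections import defaultdict


def daily_inventory_variation(trades):
    """Return {(investor, day): v}, with v = value bought - value sold (Eq. 1)."""
    v = defaultdict(float)
    for day, buyer, seller, price, size in trades:
        value = price * size
        v[(buyer, day)] += value   # a purchase raises the buyer's inventory
        v[(seller, day)] -= value  # a sale lowers the seller's inventory
    return dict(v)


# Hypothetical toy records: (day, buyer_id, seller_id, price, size)
trades = [
    (1, "A", "B", 10.0, 100),
    (1, "B", "A", 10.5, 40),
    (2, "A", "C", 11.0, 10),
]
v = daily_inventory_variation(trades)
```

Because every trade has one buyer and one seller, the variations of all investors sum to zero on each day; this holds for the full market, not for a subset such as the most active traders.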



III. STATISTICS OF CORRELATION MATRIX 
BETWEEN TWO TIME SERIES OF INVENTORY 
VARIATIONS 

A. Distributions of cross-correlation coefficients 

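The empirical correlation matrix $C$ is constructed from
the time series of inventory variation $v_i(t)$ of the investigated
stock, with elements defined as

$$ C_{ij} = \frac{\langle (v_i - \langle v_i \rangle)(v_j - \langle v_j \rangle) \rangle}{\sigma_i \sigma_j}, \tag{2} $$

where $\sigma_i$ is the standard deviation of $v_i$.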

Since the results for individual stocks are quantitatively
similar, we put the cross-correlation coefficients of the 15
stocks into one sample. We find that the mean value
is $\langle C_{ij} \rangle = 0.02$ for the real data and for the shuffled
data. When the types of investors are taken into
account, the mean value of the cross-correlation coefficients
is $\langle C_{ij} \rangle = 0.048$ (shuffled: -0.001) for both $i$ and
$j$ being individuals, $\langle C_{ij} \rangle = 0.014$ (shuffled: 0) for both
$i$ and $j$ being institutions, and $\langle C_{ij} \rangle = -0.008$ (shuffled:
-0.001) for $i$ being an individual and $j$ being an institution.
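
The comparison between real and shuffled data can be sketched as follows. The example builds the correlation matrix of Eq. (2) for synthetic series that share a common component, then destroys the cross-correlations by shuffling each series independently in time. The synthetic data, the sizes `T` and `N`, and all function names are assumptions for illustration; the sketch is not the shuffling procedure applied to the actual trading records.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_matrix(series):
    n = len(series)
    return [[pearson(series[i], series[j]) for j in range(n)] for i in range(n)]


def mean_offdiagonal(C):
    """Average of C_ij over all pairs i < j."""
    n = len(C)
    vals = [C[i][j] for i in range(n) for j in range(i + 1, n)]
    return sum(vals) / len(vals)


rng = random.Random(0)
T, N = 240, 6  # assumed number of days and investors
common = [rng.gauss(0, 1) for _ in range(T)]
series = [[c + rng.gauss(0, 1) for c in common] for _ in range(N)]
shuffled = [rng.sample(s, len(s)) for s in series]  # independent time shuffles

real_mean = mean_offdiagonal(correlation_matrix(series))
shuffled_mean = mean_offdiagonal(correlation_matrix(shuffled))
```

In this toy setting the mean coefficient of the original series is clearly positive, while that of the shuffled series fluctuates around zero, which mirrors the role of the shuffled values quoted above as a baseline.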

Figure 1 plots the daily returns of stock 000001 and
the sliding average values of the correlation coefficients
$\langle C_{ij} \rangle$ for comparison. By and large, large values of $\langle C_{ij} \rangle$
appear during periods of large price fluctuations,
which is reminiscent of the similar result for
cross-correlations of financial returns [14]. However, the short
time period of our data sample does not allow us to reach
a decisive conclusion: there are also less volatile time
periods with large $\langle C_{ij} \rangle$. The situation is quite similar
for other stocks.

FIG. 1. (Color online) Evolution of the 5-day average cross-correlation
coefficient $\langle C_{ij} \rangle$ and the daily return $R$.

TABLE I. Basic statistics of the investigated stocks. The first column is the stock code, which is the unique identity of each
stock. The second and third columns present the investor-averaged total inventory variation $\langle \sum_t v \rangle_i$ and average absolute variation
$\langle \langle |v| \rangle_t \rangle_i$. The fourth to eighth columns give the number of investors $N$, the number of trending investors $N_{\rm tr}$, the number
of reversing investors $N_{\rm re}$, the number of uncategorized investors $N_{\rm un}$, and the slope $k$ of the factor versus stock return. The
variables in the ninth to thirteenth columns are the same as in the five "all investors" columns but for individual investors,
and the last five columns are for institutional investors. Each value in the last row gives the sum of the numbers in the same
column.

Figure 2(a) shows the empirical probability distributions
of $C_{ij}$, calculated using daily inventory variations.
The four curves with different markers correspond
to $C_{ij}$, $C_{\rm ind,ind}$, $C_{\rm ind,ins}$ and $C_{\rm ins,ins}$, respectively. It is
found that most coefficients are small and the tails are
exponential:

$$ P(C) \sim \begin{cases} e^{\lambda_- C}, & -0.6 < C < -0.1 \\ e^{-\lambda_+ C}, & 0.1 < C < 0.6 \end{cases} \tag{3} $$

where $\lambda_+ = 8.8 \pm 0.2$ and $\lambda_- = 11.1 \pm 0.3$ for $C_{ij}$, $\lambda_+ =
8.9 \pm 0.2$ and $\lambda_- = 12.5 \pm 0.4$ for $C_{\rm ind,ind}$, $\lambda_+ = 10.8 \pm 0.4$
and $\lambda_- = 10.5 \pm 0.4$ for $C_{\rm ind,ins}$, and $\lambda_+ = 6.6 \pm 0.5$ and
$\lambda_- = 8.7 \pm 0.5$ for $C_{\rm ins,ins}$, respectively. We find that there
are more positive cross-correlation coefficients ($\lambda_+ < \lambda_-$)
when both investors are individuals or both are institutions. In
contrast, the distribution is symmetric ($\lambda_+ \approx \lambda_-$) when
one investor is an individual while the other is an institution.
This finding implies that herding behaviors are
more likely to occur among investors of the same type and that
individuals are more likely to herd than institutions.
We shuffle the original time series and perform the
same analysis. The resulting distributions collapse onto
a single curve, which has an exponential form

$$ P(C) \sim e^{-\lambda_{\rm shuf} |C|}, \tag{4} $$

where the parameter $\lambda_{\rm shuf} = 23.3$ is determined using
robust regression [22, 23]. It is not surprising that
real data have higher cross-correlations than the shuffled
data, which is confirmed by $\lambda_\pm < \lambda_{\rm shuf}$. This exponential
distribution is different from the Gaussian distribution
for the shuffled data of financial returns [14].
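
The decay rates in Eqs. (3) and (4) are slopes of the distribution in log-linear coordinates. The sketch below estimates such a rate from a histogram by ordinary least squares on the logarithm of the empirical density. Note the swap: the text uses robust regression, whereas this example uses a plain least-squares fit, and the sample is synthetic (drawn from an exponential law with rate 8.8, the value reported for $\lambda_+$ of $C_{ij}$). The function name, bin count and sample size are assumptions.

```python
import math
import random


def fit_exponential_rate(samples, lo, hi, nbins=10):
    """Estimate lam in P(C) ~ exp(-lam * C) for lo <= C < hi.

    Ordinary least squares on log-density versus bin center.
    """
    width = (hi - lo) / nbins
    counts = [0] * nbins
    for c in samples:
        if lo <= c < hi:
            counts[min(int((c - lo) / width), nbins - 1)] += 1
    xs, ys = [], []
    for k, n in enumerate(counts):
        if n > 0:  # empty bins have no finite log-density
            xs.append(lo + (k + 0.5) * width)
            ys.append(math.log(n / (len(samples) * width)))
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return -slope


rng = random.Random(1)
samples = [rng.expovariate(8.8) for _ in range(200_000)]  # synthetic positive tail
lam_plus = fit_exponential_rate(samples, 0.1, 0.6)
```

A least-squares fit is sensitive to sparsely populated bins in the far tail, which is one reason a robust estimator may be preferred on real data.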

Figure 2(b) plots the distributions in double logarithmic
coordinates, where the negative parts are reflected to
the right with respect to $C_{ij} = 0$. Nice power laws are
observed spanning three orders of magnitude:

$$ P(C) \sim \begin{cases} (-C)^{-\gamma_-}, & -10^{-2} < C < -10^{-5} \\ C^{-\gamma_+}, & 10^{-5} < C < 10^{-2} \end{cases} \tag{5} $$

where $\gamma_+ = 0.69 \pm 0.01$ and $\gamma_- = 0.69 \pm 0.01$ for $C_{ij}$,
$\gamma_+ = 0.67 \pm 0.02$ and $\gamma_- = 0.62 \pm 0.02$ for $C_{\rm ind,ind}$,
$\gamma_+ = 0.67 \pm 0.02$ and $\gamma_- = 0.70 \pm 0.02$ for $C_{\rm ind,ins}$, and
$\gamma_+ = 0.72 \pm 0.02$ and $\gamma_- = 0.73 \pm 0.02$ for $C_{\rm ins,ins}$, respectively.
It is found that $\gamma_- \approx \gamma_+$ and all the power-law
exponents are close to each other. An intriguing feature
is that the distributions of $C_{ij}$, $C_{\rm ind,ind}$ and $C_{\rm ind,ins}$
exhibit an evident bimodal behavior, which is reminiscent
of the distributions of waiting times and interevent times
of human short message communication [24]. Certainly,
the underlying mechanisms are different and the factors
causing the bimodal distribution of the cross-correlations
are unclear.



We are naturally more interested in large
cross-correlations. The preceding discussion focuses on
cross-correlations not larger than 0.6. As shown in
Fig. 2(a), there are pairs of inventory variation time series
with very large cross-correlations that look like outliers.
For better visibility, we plot in Fig. 2(c) the
numbers of occurrences of positive and negative
cross-correlations in 10 nonoverlapping intervals for the four
types of pairs. It is shown that $N(C > 0) > N(C < 0)$
in all intervals for $C = C_{ij}$, $C_{\rm ind,ind}$ and $C_{\rm ins,ins}$. In
contrast, $N(C_{\rm ind,ins} > 0) < N(C_{\rm ind,ins} < 0)$ when
$|C_{\rm ind,ins}| < 0.6$ and $N(C_{\rm ind,ins} > 0) > N(C_{\rm ind,ins} < 0)$
when $|C_{\rm ind,ins}| > 0.6$. Hence, for larger cross-correlations
($|C| > 0.6$), there are many more occurrences of positive
cross-correlations than negative ones for all four types
of pairs. This striking feature can be attributed to two
reasons. The first is that a large proportion of investors
react to the same external news in the same direction
[IJ: they buy following good news and sell following bad
news. The second is that investors imitate the trading
behaviors of others of the same type and rarely imitate
investors of a different type. The second reason is
plausible because the friends of individual (or institutional)
investors are more likely to be individual (or institutional)
investors.



B. Eigenvalue spectrum 

FIG. 2. (Color online) Empirical distributions of the cross-correlation
coefficients for all the 15 stocks. (a) Log-linear plot
of $P(C_{ij})$ for cross-correlations between any two investors,
any two individuals, any one individual and one institution,
and any two institutions, which shows exponential forms when
$0.1 < |C| < 0.6$. The dashed line corresponds to the result of
shuffled data. (b) Log-log plot of $P(C_{ij})$, which shows power-law
forms when $10^{-5} < |C| < 0.01$. (c) Comparison of occurrence
numbers of positive and negative cross-correlations.
The ordinate gives $\log_{10}[1 + N(C_{ij})]$ rather than $\log_{10}[N(C_{ij})]$
for better presentation.

For the correlation matrix $C$ of each stock, we can
calculate its eigenvalues, whose density $f_C(\lambda)$ is defined
as follows [12]:

$$ f_C(\lambda) = \frac{1}{N} \frac{dn(\lambda)}{d\lambda}, \tag{6} $$

where $n(\lambda)$ is the number of eigenvalues of $C$ less than
$\lambda$. If $M$ is a $T \times N$ random matrix with zero mean
and unit variance, $f_C(\lambda)$ is self-averaging. In particular,
in the limit $N \to \infty$, $T \to \infty$ with $Q = T/N > 1$ fixed,
the probability density function $f_C(\lambda)$ of the eigenvalues $\lambda$ of
the corresponding random correlation matrix can be described as

$$ f_C(\lambda) = \frac{Q}{2\pi\sigma^2} \frac{\sqrt{(\lambda_{\max} - \lambda)(\lambda - \lambda_{\min})}}{\lambda}, \tag{7} $$

with $\lambda \in [\lambda_{\min}, \lambda_{\max}]$, where $\lambda^{\max}_{\min}$ is given by

$$ \lambda^{\max}_{\min} = \sigma^2 \left( 1 + 1/Q \pm 2\sqrt{1/Q} \right), \tag{8} $$

and $\sigma^2$ is equal to the variance of the elements of $M$
[12, 25]. The variance $\sigma^2$ is equal to 1 in our normalized
data.
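
Equations (7) and (8) are straightforward to evaluate numerically. The sketch below computes the bounds and the density for $Q = 237/80$, the value used for stock 000001, with $\sigma^2 = 1$. The function names are chosen for this example only.

```python
import math


def marchenko_pastur_bounds(Q, sigma2=1.0):
    """Return (lambda_min, lambda_max) from Eq. (8)."""
    root = 2.0 * math.sqrt(1.0 / Q)
    return sigma2 * (1.0 + 1.0 / Q - root), sigma2 * (1.0 + 1.0 / Q + root)


def marchenko_pastur_density(lam, Q, sigma2=1.0):
    """Eigenvalue density of Eq. (7); zero outside [lambda_min, lambda_max]."""
    lam_min, lam_max = marchenko_pastur_bounds(Q, sigma2)
    if not lam_min < lam < lam_max:
        return 0.0
    return (
        Q / (2.0 * math.pi * sigma2)
        * math.sqrt((lam_max - lam) * (lam - lam_min)) / lam
    )


Q = 237 / 80  # T = 237 trading days, N = 80 investors
lam_min, lam_max = marchenko_pastur_bounds(Q)
```

For this $Q$ the bulk is predicted to lie between roughly 0.18 and 2.50, so an empirical eigenvalue well above 2.50 is not compatible with a purely random correlation matrix of the same shape.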




FIG. 3. (Color online) Eigenvalue spectrum of the correlation
matrix of inventory variation of investors trading stock 000001
within a 1-day time horizon in 2003. The solid line is the spectral
density obtained by shuffling independently the buyers
and the sellers in such a way as to maintain the same number of
purchases and sales for each investor as in the real data. The
dashed blue line shows the spectral density predicted by
random matrix theory using Eq. (7) with $Q = 237/80 = 2.96$.
The inset shows the largest eigenvalue $\lambda_1$ (○) and the second
largest eigenvalue $\lambda_2$ (□) of all 15 investigated data sets
from 15 stocks. The solid line indicates the upper thresholds
obtained from shuffling experiments, and the dashed line presents the
threshold predicted by random matrix theory.

Figure 3 illustrates the probability distribution $f_C(\lambda)$ of
the correlation matrix of inventory variation of investors
trading stock 000001. The solid line is the spectral density
obtained by shuffling independently the buyers and
the sellers in such a way as to maintain the same number of
purchases and sales for each investor as in the real data,
while the dashed blue line shows the spectral density predicted
by random matrix theory using Eq. (7) with
$Q = 237/80 = 2.96$. We find that the largest eigenvalue is well
outside the bulk and the second largest eigenvalue also
escapes the bulk. The results for the other 14 stocks are quite
similar. In the inset of Fig. 3, we plot the largest eigenvalues
$\lambda_1$ and the second largest eigenvalues $\lambda_2$ for all the
15 stocks. We find that all the largest eigenvalues are well
above the upper thresholds determined from shuffling experiments
and the thresholds $\lambda_{\max}$ in Eq. (8) predicted
by random matrix theory. Moreover, all the second
largest eigenvalues are above the two threshold lines and
some of them are well above the thresholds. These findings
indicate that both the largest and the second largest
eigenvalues carry information about the investors, which
is different from the results for the Spanish
stock market, where only the largest eigenvalue is
larger than the upper thresholds while the second largest
eigenvalue is within the bulk [5]. This discrepancy can
be attributed to the difference between the two markets and
to the fact that our analysis contains both individuals and
institutions, while Lillo et al. studied only firms.



C. Distribution of eigenvector components 

If there is no information contained in an eigenvalue,
the normalized components of its associated eigenvector
should conform to a Gaussian distribution [12-14]:

$$ f(u) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{u^2}{2} \right). \tag{9} $$

Since the empirical eigenvalue distribution $f_C(\lambda)$ deviates
from the theoretical expression (7) of random matrix
theory, with two large eigenvalues outside the bulk of
the distribution, it is expected that the associated eigenvectors
also contain certain information.

FIG. 4. Distribution of eigenvector components $u$: (a) all
the eigenvectors associated with the eigenvalues in the bulk
$\lambda_{\min} < \lambda < \lambda_{\max}$ after normalization for each eigenvector for
stock 000001, (b) same as (a) but for stock 200625, (c) all the
eigenvectors associated with the largest eigenvalues $\lambda_1$ after
normalization for all the stocks, and (d) all the eigenvectors
associated with the second largest eigenvalues $\lambda_2$ after
normalization for all the stocks. The solid lines show the Gaussian
distribution predicted by random matrix theory.



For correlation matrices of financial returns, the components
of an eigenvector with the eigenvalue $\lambda$ in the
bulk of its distribution ($\lambda_{\min} < \lambda < \lambda_{\max}$) are distributed
according to Eq. (9) fr34li |. Panels (a) and (b)
in Fig. 4 show the empirical distributions of the eigenvector
components $u$ with the eigenvalues in the bulk for
two typical stocks. Rather than analyzing a single
eigenvector, we normalize the components of each
eigenvector and pool all the eigenvectors together to gain
better statistics, since each eigenvector has only 80 components.
We find that the distributions of 10 stocks are
consistent with the Gaussian, while the other 5 stocks
exhibit high peaks in the center.

For the deviating eigenvalues $\lambda_1$ and $\lambda_2$, the distribution
for each stock is very noisy and deviates from the Gaussian.
We treat the components of the 15 eigenvectors as one sample
to obtain better statistics. The two resulting distributions
are illustrated in Figs. 4(c) and 4(d). It is evident
that both deviate from the Gaussian distribution and that the
distribution for $\lambda_1$ is more skewed.



D. Information in eigenvectors for deviating 
eigenvalues 

We have shown that the largest and the second largest
eigenvalues deviate from the RMT prediction and that the
distributions of their eigenvector components are not Gaussian.
This implies that these eigenvectors carry some information.
For $u(\lambda_2)$, it is not clear what kind of information
it carries. We find no evident dependence of the
magnitude of $u_i(\lambda_2)$ on the average absolute inventory
variation $\langle |v_i| \rangle$, the total variation $\sum_t v_i$, or the maximal
absolute variation $\max\{|v_i|\}$. The same conclusion is
obtained for $u_i(\lambda_1)$, which differs from the conclusion that
the eigenvector components of the return correlation matrix
depend on the market capitalization in a logarithmic
form [14]. In addition, as we will show in the next section,
the investors can be categorized into three trading
types, and we find no relation between the trading strategy
category and the magnitude of the vector component
$u(\lambda_2)$. We thus focus on extracting the information from
$u(\lambda_1)$.

For the correlation matrix whose elements are the correlation
coefficients of price fluctuations of two stocks,
the eigenvector of the largest eigenvalue contains market
information [12, 13]. The market information indicates
the collective behavior of stock price movements
[26], which can be unveiled by the projection of the
time series on the eigenvector QJJ. We follow this approach
and calculate the projection $G(t)$ of the time series
$V_i(t) = [v_i(t) - \langle v_i(t) \rangle]/\sigma_i$ on the eigenvector $u(\lambda_1)$
corresponding to the first eigenvalue [5]:

$$ G(t) = \sum_{i} V_i(t) \times u_i(\lambda_1). \tag{10} $$

The projection $G$ can be called the factor associated
with the largest eigenvalue [5]. We plot the factor $G(t)$
against the normalized return $R(t)$ for stock 000001 in
Fig. 5(a). There is a nice linear dependence between
the two variables and a linear regression gives the slope
$k = 0.83 \pm 0.04$. It indicates that these most active investors
have a dominating influence on the price fluctuations.

FIG. 5. (Color online) Influence of the most active investors
trading stock 000001 on the price fluctuations for all the investigated
investors (a), for individual investors (b), and for institutional
investors (c), where the slopes are $k = 0.83 \pm 0.04$,
$k_{\rm ind} = 0.83 \pm 0.04$, and $k_{\rm ins} = 0.41 \pm 0.06$, respectively. Panel
(d) plots $k_{\rm ind}$ and $k_{\rm ins}$ against $k$, where each symbol corresponds
to a stock.

Panels (b) and (c) of Fig. 5 illustrate the relation between
the factor and the return for individuals and institutions.
Linear regression gives $k_{\rm ind} = 0.83 \pm 0.04$ and
$k_{\rm ins} = 0.41 \pm 0.06$. Comparing (b) and (c) with (a), we
find that the influence of individuals matches excellently
with that of the whole sample, which can be quantified by the
facts that $k_{\rm ind} = k$ and $k_{\rm ins} < k$. The results are similar
for other stocks. The resulting $k_{\rm ind}$ and $k_{\rm ins}$ are plotted
in Fig. 5(d) against $k$ for all the 15 stocks. For individual
investors, we find that $k_{\rm ind} = k$ for 13 stocks and $k_{\rm ind} > k$
for 2 stocks. In contrast, we find that $k_{\rm ins} < k$ except for
one stock.
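
The projection of Eq. (10) and the slope $k$ can be sketched on synthetic data as follows. The example standardizes each series, builds the correlation matrix, extracts the leading eigenvector by power iteration (a stand-in for a full eigendecomposition), and regresses the factor $G(t)$ on the return $R(t)$. The synthetic investors, who all load positively on the return, as well as the sizes `T` and `N` and all function names, are assumptions for illustration.

```python
import math
import random


def standardize(x):
    """Return (x - mean) / std, as in the definition of V_i(t)."""
    n = len(x)
    m = sum(x) / n
    s = math.sqrt(sum((a - m) ** 2 for a in x) / n)
    return [(a - m) / s for a in x]


def leading_eigenpair(C, iters=500):
    """Power iteration for the largest eigenvalue and its eigenvector."""
    n = len(C)
    u = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(C[i][j] * u[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(a * a for a in w))
        u = [a / norm for a in w]
    lam = sum(u[i] * sum(C[i][j] * u[j] for j in range(n)) for i in range(n))
    return lam, u


def ols_slope(x, y):
    """Slope of the least-squares line y = a + k * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )


rng = random.Random(2)
T, N = 240, 8  # assumed number of days and investors
R = standardize([rng.gauss(0, 1) for _ in range(T)])
V = [standardize([0.8 * r + rng.gauss(0, 1) for r in R]) for _ in range(N)]

C = [[sum(V[i][t] * V[j][t] for t in range(T)) / T for j in range(N)] for i in range(N)]
lam1, u1 = leading_eigenpair(C)
G = [sum(V[i][t] * u1[i] for i in range(N)) for t in range(T)]  # Eq. (10)
k = ols_slope(R, G)
```

The sign of an eigenvector is arbitrary, so the sign of $k$ depends on the chosen orientation of $u(\lambda_1)$; starting the iteration from a positive vector fixes it here because all synthetic correlations are positive.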



IV. INVENTORY VARIATION AND STOCK
RETURN

A. Categorization of investors

Following Ref. [5], we divide the investors into three
categories according to the cross-correlation coefficient
$C_{V_iR}$ between the inventory variation $V_i$ and the stock
return $R$. The investor $i$ belongs to the trending or reversing
category if its inventory variation is significantly positively or
negatively correlated with the return. We use a $2\sigma$
significance threshold to categorize the investors:

$$ \pm 2\sigma = \pm 2/\sqrt{N_T}, \tag{11} $$

where $N_T$ is the number of time records for each time
series [5]. We also verify the robustness of Eq. (11) by
comparing the experimental results with the results of
a null hypothesis based on a block bootstrap of both $R$
and $V$. To this end, 1000 block bootstrap replicas with
a block length of 20 are generated. For each investor,
we check whether the estimated correlation with the
return exceeds the 0.97725 quantile or is smaller than
the 0.02275 quantile of the correlation distribution obtained
from the bootstrap replicas. The results are shown in
Fig. 6(a-c). There are 1211 investors in the whole sample
in Fig. 6(a), including 453 institutional investors in
Fig. 6(b) and 758 individual investors in Fig. 6(c).

FIG. 6. (Color online) Panels (a-c) show the scatter plots of $C_{VR}$ versus a proxy of the size of the investor. For each stock,
the proxy is the ratio of the value exchanged by the investor (scaled by a factor $10^4$) to the capitalization of the stock. Each
marker refers to an investor trading a specific stock. The three kinds of markers refer to investors whose inventory variations are
positively correlated (○), negatively correlated (□), or uncorrelated (a) with returns according to the block bootstrap analysis.
The two dashed lines indicate the $2\sigma$ threshold calculated using Eq. (11). Panels (d-f) are contour plots of the correlation
matrix of daily inventory variation of investors trading the stock 000001. We have sorted the investors into rows and columns
according to their cross-correlation coefficients $C_{VR}$ of inventory variation with the price return. The evolution of $C_{VR}$ in the
same order as in the matrix is shown in the bottom panel, where the dashed lines bound the $\pm 2\sigma$ significance intervals.

As shown in the last row of Table I, the numbers of the
three kinds of investors (trending, reversing and uncategorized)
are 46, 34 and 373 for institutional investors and
41, 381 and 336 for individual investors, respectively. We
find that most institutional investors are uncategorized
and that there are more trending investors than reversing
investors among them. These results are different from the Spanish case,
where only one-third of the investors are uncategorized and the
number of reversing firms is about three times
the number of trending firms [5]. In contrast,
about half of the individuals are reversing investors and only
6% of the individuals are trending investors. The observation
that most investors are uncategorized is probably due to
the fact that the Chinese market was emerging and its
investors were not experienced. Comparing individuals and
institutions, we find a larger proportion of individuals
exhibiting a reversing behavior. It indicates that these
individuals buy when the price drops and sell when the
price rises on the same day. This finding is interesting
since it may explain their worse performance in stock
markets [19, El-

The empirical evidence for the significant cross-correlation
between the inventory variation $V_i(t)$ of trending
and reversing investors and the stock return $R(t)$ leads us
to adopt a linear model for the dynamics of inventory
variation as a first approximation [5]:

$$ V_i(t) = \gamma_i R(t), \tag{12} $$

where $\gamma_i$ is proportional to the cross-correlation coefficient
$C_{V_iR}$. It follows immediately that the cross-correlation
coefficient between the inventory variations
of two investors is

$$ C_{ij} = C_{V_i V_j} = \gamma_i \gamma_j. \tag{13} $$

If two investors belong to the same category, either trending
($\gamma_i > 2\sigma$ and $\gamma_j > 2\sigma$ significantly) or reversing
($\gamma_i < -2\sigma$ and $\gamma_j < -2\sigma$ significantly), the value of $C_{ij}$
is expected to be significantly positive. On the contrary,
if two investors belong respectively to the trending and
reversing categories, the value of $C_{ij}$ is expected to be
significantly negative. To show the performance of the
model, we plot the contours of the correlation matrix of
inventory variation for all investors, for individuals and
for institutions, where the investors are sorted according
to their cross-correlation coefficients $C_{VR}$ of the inventory
variation with the price return. Figure 6(d) shows
that the top-left corner gives large positive $C_{ij}$ values and
the bottom-left and top-right corners give large negative
$C_{ij}$ values, as expected. Figure 6(e) gives better results
for individual investors, validating the linear model
(12). The results in Fig. 6(f) are worse for institutional
investors, which is due to the fact that most $C_{VR}$ values
are small for institutions, as illustrated in Fig. 6(c).
However, Fig. 6(f) does not invalidate the linear model,
since there are only three trending institutions and one
reversing institution. Indeed, the situation is quite similar
for other individual stocks with very few investors of
the same category, as shown in Table I.
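
The categorization rule of Eq. (11) and the shares implied by the counts in the last row of Table I can be sketched as follows. The value $N_T = 237$ is taken from $Q = 237/80$ for stock 000001 and is used here only as an example; the function and variable names are assumptions. The sketch applies the analytic $2\sigma$ threshold, not the block bootstrap check.

```python
import math


def categorize(c_vr, n_records):
    """Classify an investor from C_VR with the +/- 2/sqrt(N_T) threshold."""
    threshold = 2.0 / math.sqrt(n_records)
    if c_vr > threshold:
        return "trending"
    if c_vr < -threshold:
        return "reversing"
    return "uncategorized"


threshold = 2.0 / math.sqrt(237)  # about 0.13 for 237 daily records

# Counts reported in the last row of Table I.
counts = {
    "institution": {"trending": 46, "reversing": 34, "uncategorized": 373},
    "individual": {"trending": 41, "reversing": 381, "uncategorized": 336},
}
shares = {
    kind: {cat: n / sum(by_cat.values()) for cat, n in by_cat.items()}
    for kind, by_cat in counts.items()
}
```

With these counts, about 50% of individuals are reversing and about 5% are trending, while about 10% of institutions are trending and about 8% are reversing.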



B. Causality 

In Sec. IV A, we have shown that the inventory variation
$V_i(t)$ and the stock return $R(t)$ have significant positive
or negative correlation for part of the investors. It is
interesting to investigate the lead-lag structure between
these two variables. For the large majority of reversing
and trending firms in the Spanish stock market, it
is found that returns Granger cause inventory variation
but not vice versa at the day or intraday level, and the
Granger causality disappears over longer time intervals.
Here, we aim to study the same topic for both individual
and institutional investors.



We first investigate the autocorrelation function
$C_{V(t)V(t+\tau)}$ of the inventory variation time series sampled
in 15-min time intervals. Figure 7(a) shows the three
autocorrelation functions for all the trending, reversing and
uncategorized investors. Each autocorrelation function
is obtained by averaging the autocorrelation functions of
the investors in the same category to obtain better statistics.
It is found that the inventory variation is long-term
correlated and the correlation is significant over dozens of
minutes, which can be partly explained by the order splitting
behavior of large investors [27-29]. We also find that
the correlation is stronger among trending investors than
among reversing investors. Figures 7(b) and 7(c) illustrate
the results for individuals and institutions. We observe
that institutions have stronger long memory than individuals.
It implies that institutions are more specialized
in their trading strategies than individuals [5].

FIG. 7. (Color online) The first column (a,d,g,j) shows the results for all investors. The second column (b,e,h,k) shows the results for all individual investors. The third column (c,f,i,l) shows the results for all institutional investors. (a-c) Averaged autocorrelation functions $C_{V(t)V(t+\tau)}$ of the 15-min inventory variation V for trending, reversing and uncategorized investors. The dashed lines give the 5% significance level. (d-f) Averaged lagged cross-correlation functions $C_{V(t)R(t+\tau)}$ for the 15-min inventory variation for trending, reversing and uncategorized investors. The dashed lines bound the $\pm 2\sigma$ significance interval. (g-i) Conditional expected value of the indicator $I(x \to y)$ of the rejection of the null hypothesis of non-Granger causality between x and y with 95% confidence as a function of the time horizon $\Delta T$. The dashed lines show the 5% significance level. (j-l) Conditional expected value of the indicator $I(x \to y)$ as a function of the simultaneous cross-correlation $C[V_i(t), R(t)]$. The black symbols refer to the Granger test on shuffled data and the dashed lines bound the $\pm 2\sigma$ significance interval.

Panels (d-f) of Fig. 7 illustrate the averaged lagged cross-correlation functions $C_{V(t)R(t+\tau)}$ between inventory variations and returns. The results in the three panels are qualitatively the same. For uncategorized investors, no significant cross-correlations are found between inventory variations and returns, which is trivial due to the "definition" of this category, as shown in Fig. 6(a-c). For trending and reversing investors, it is evident that the returns lead the inventory variations by dozens of minutes ($\tau < 0$), where the cross-correlation $C_{V(t)R(t+\tau)}$ is significantly nonzero. When the price drops, trending investors sell shares to reduce their inventory within a few minutes, while reversing investors buy shares to increase their inventory. When the price rises, trending investors buy shares to increase their inventory within a few minutes, while reversing investors sell shares to reduce their inventory. Meanwhile, we also observe nonzero cross-correlations for $\tau > 0$ over shorter time periods, which means that the inventory variations lead returns.
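As an illustration of the lagged cross-correlation discussed above, the following minimal sketch computes the Pearson correlation between V(t) and R(t + tau) on synthetic data. The function names, the synthetic series, and the 2/sqrt(n) significance band for uncorrelated series are assumptions made for this example; they are not taken from the paper's own procedure.

```python
import math
import random


def lagged_cross_correlation(v, r, tau):
    """Pearson correlation between v(t) and r(t + tau) over the overlapping samples."""
    n = len(v)
    if len(r) != n:
        raise ValueError("v and r must have the same length")
    if abs(tau) >= n - 1:
        raise ValueError("lag too large for the series length")
    if tau >= 0:
        x, y = v[:n - tau], r[tau:]      # pairs (v(t), r(t + tau)), tau >= 0
    else:
        x, y = v[-tau:], r[:n + tau]     # pairs (v(t), r(t - |tau|))
    m = len(x)
    mean_x, mean_y = sum(x) / m, sum(y) / m
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sy = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sx * sy)


def two_sigma_band(n, tau):
    """Assumed 2-sigma band for uncorrelated series: 2 / sqrt(number of pairs)."""
    return 2.0 / math.sqrt(n - abs(tau))


# Synthetic example: a hypothetical trending investor who copies returns two steps later.
rng = random.Random(0)
returns = [rng.gauss(0.0, 1.0) for _ in range(500)]
inventory = [0.0, 0.0] + returns[:-2]

for tau in (-3, -2, -1, 0, 1):
    c = lagged_cross_correlation(inventory, returns, tau)
    print(tau, round(c, 3), "band:", round(two_sigma_band(len(returns), tau), 3))
```

In this toy setting the correlation peaks at tau = -2, which corresponds to returns leading inventory variations, the situation described for trending investors.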

To further explore the lead-lag structure between inventory variations and returns, we perform Granger causality analysis. We define an indicator $I(X \to Y)$, whose value is 1 if X Granger causes Y and 0 otherwise. In our analysis, the time resolution of the two time series is 15 min. The values of $I(V \to R)$ and $I(R \to V)$ for all investors are determined at different time scales $\Delta T$. The average indicator values $E[I(V \to R)]$ and $E[I(R \to V)]$ are plotted in Fig. 7(g-i) with respect to $\Delta T$ for all investors, individual investors and institutional investors. Both $I(V \to R)$ and $I(R \to V)$ are decreasing functions of $\Delta T$. We note that $E[I(X \to Y)]$ is the percentage of investors with $I(X \to Y) = 1$. Figure 7 shows that there are more investors with $I(R \to V) = 1$ than investors with $I(V \to R) = 1$. On average, bidirectional Granger causality is observed at intraday time scales and disappears at the weekly level or longer. Moreover, individual investors are more likely than institutions to be influenced by intraday price fluctuations, because the $I(R \to V)$ values of individuals are greater than those of institutions at the same time scale.
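The indicator $I(X \to Y)$ can be illustrated with a small, self-contained sketch. It assumes a lag order of one and an asymptotic chi-square (Wald) test at the 5% level, with 3.841 as the 95% quantile of the chi-square distribution with one degree of freedom; the lag order and test variant actually used in the paper are not stated in this section, so these choices, the function names and the synthetic data are all illustrative assumptions.

```python
import random

CHI2_1DOF_95 = 3.841  # 95% quantile of the chi-square distribution with 1 degree of freedom


def ols_rss(columns, y):
    """Residual sum of squares of the least-squares fit of y on the given regressor columns."""
    k = len(columns)
    # Normal equations (X'X) beta = X'y, solved by Gaussian elimination with pivoting.
    a = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
         for i in range(k)]
    c = [sum(p * q for p, q in zip(columns[i], y)) for i in range(k)]
    for i in range(k):
        pivot = max(range(i, k), key=lambda row: abs(a[row][i]))
        a[i], a[pivot] = a[pivot], a[i]
        c[i], c[pivot] = c[pivot], c[i]
        for row in range(i + 1, k):
            factor = a[row][i] / a[i][i]
            for j in range(i, k):
                a[row][j] -= factor * a[i][j]
            c[row] -= factor * c[i]
    beta = [0.0] * k
    for i in reversed(range(k)):
        beta[i] = (c[i] - sum(a[i][j] * beta[j] for j in range(i + 1, k))) / a[i][i]
    return sum((y_t - sum(b * col[t] for b, col in zip(beta, columns))) ** 2
               for t, y_t in enumerate(y))


def granger_indicator(x, y):
    """Return 1 if x Granger-causes y at lag 1 (assumed asymptotic test, 5% level), else 0."""
    target = y[1:]
    ones = [1.0] * len(target)
    y_lag, x_lag = y[:-1], x[:-1]
    rss_restricted = ols_rss([ones, y_lag], target)           # y(t) on const, y(t-1)
    rss_full = ols_rss([ones, y_lag, x_lag], target)          # plus x(t-1)
    wald = len(target) * (rss_restricted - rss_full) / rss_full
    return 1 if wald > CHI2_1DOF_95 else 0


# Synthetic example: inventory variations of a hypothetical investor follow lagged returns.
rng = random.Random(1)
r = [rng.gauss(0.0, 1.0) for _ in range(400)]
v = [0.0] + [0.8 * r[t - 1] + 0.5 * rng.gauss(0.0, 1.0) for t in range(1, 400)]
print("I(R -> V) =", granger_indicator(r, v))
print("I(V -> R) =", granger_indicator(v, r))
```

Averaging such indicators over a group of investors gives the fraction of investors for which the null hypothesis of non-causality is rejected, which is the quantity $E[I(X \to Y)]$ described in the text.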

We then investigate the impact of investor category on the causality indicator. The results for $\Delta T = 4$ (i.e., one hour) are depicted in Fig. 7(j-l). The middle parts bounded by two vertical lines at $C_{VR} = \pm\sigma$ correspond to uncategorized investors. The left parts ($C_{VR} < -\sigma$) correspond to reversing investors and the right parts ($C_{VR} > \sigma$) correspond to trending investors. It is found that an investor adjusts his inventory following price fluctuations with very large probability when his $|C_{VR}|$ value is large. This conclusion holds for both individual and institutional investors. The strong Granger causality from returns to inventory variations and the weak but significant causality from inventory variations to returns cannot be attributed to non-Gaussianity in the distributions of the variables, as verified by bootstrapping analysis. Qualitatively similar results are obtained for other $\Delta T$ values.
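The conditioning of the causality indicator on investor category can be sketched as follows. The threshold value, the function names and the sample numbers are hypothetical and serve only to show the bookkeeping; they are not data from the paper.

```python
def categorize(c_vr, threshold):
    """Assign a category from the contemporaneous correlation C_VR and a threshold."""
    if c_vr > threshold:
        return "trending"
    if c_vr < -threshold:
        return "reversing"
    return "uncategorized"


def mean_indicator_by_category(c_values, indicators, threshold):
    """Average the 0/1 causality indicator within each investor category."""
    sums, counts = {}, {}
    for c_vr, indicator in zip(c_values, indicators):
        category = categorize(c_vr, threshold)
        sums[category] = sums.get(category, 0) + indicator
        counts[category] = counts.get(category, 0) + 1
    return {category: sums[category] / counts[category] for category in sums}


# Made-up example: five investors with their C_VR values and I(R -> V) indicators.
c_values = [-0.30, -0.20, 0.00, 0.05, 0.25]
indicators = [1, 1, 0, 1, 1]
print(mean_indicator_by_category(c_values, indicators, threshold=0.1))
```

Replacing the three categories by narrower bins of $C_{VR}$ would give a curve of the conditional expected indicator against $C_{VR}$, in the spirit of panels (j-l).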



C. Herding behavior 

Herding and positive feedbacks are essential for the boom of bubbles [3(1 HU . These topics have been studied extensively to understand the price formation process [Hl-dU . Herding is a phenomenon in which a group of investors trades in the same direction over a period of time. Here, we investigate possible herding behaviors in different groups of investors.

We study possible buy and sell herding behaviors within the same group of investors. Investors are classified into different groups based on their types (individual or institution) and their categories (reversing, trending or uncategorized). We define a herding index as follows [5]:



$$ h = \frac{N^{+}}{N^{+} + N^{-}}, \qquad (14) $$



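TABLE II. Number of herding days for different groups of investors. The total number of trading days is 237. The superscripts "+" and "-" indicate buy herding and sell herding, respectively. The subscripts "d" and "s" indicate individuals and institutions, respectively. The time horizon is one day.

Code   |           Reversing           |           Trending            |         Uncategorized         |
       | n_d^+ | n_d^- | n_s^+ | n_s^- | n_d^+ | n_d^- | n_s^+ | n_s^- | n_d^+ | n_d^- | n_s^+ | n_s^- |
000001 |    60 |    63 |       |       |       |       |       |       |     6 |     8 |       |       |
000002 |    23 |    19 |       |       |       |       |       |       |     3 |     6 |       |       |
000012 |    26 |    37 |       |       |       |       |       |       |     2 |     1 |       |       |
000021 |    54 |    62 |       |       |       |       |       |       |     4 |     3 |     1 |       |
000063 |    34 |    34 |     2 |     1 |       |       |       |       |       |       |     1 |     1 |
000488 |       |       |       |       |       |       |       |       |     2 |     1 |       |       |
000550 |    17 |    34 |       |       |       |       |       |       |       |       |     2 |     1 |
000625 |    21 |    15 |       |       |     3 |     7 |       |       |     5 |     2 |       |       |
000800 |    29 |    25 |       |       |       |       |       |       |     7 |     7 |       |       |
000825 |    34 |    31 |       |       |       |       |       |       |       |       |       |       |
000839 |    61 |    58 |       |       |       |       |       |       |     7 |     7 |       |       |
000858 |    38 |    36 |       |       |       |       |       |       |       |     1 |       |       |
000898 |    55 |    40 |       |       |       |       |       |       |     3 |     4 |       |       |
200488 |    31 |    24 |       |       |       |       |       |       |     2 |     7 |       |     3 |
200625 |     6 |     6 |       |       |       |       |       |       |     3 |     6 |       |     3 |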

where $N^{+}$ is the number of buying investors and $N^{-}$ is the number of selling investors in the same group over a given time horizon. When the probability of the observed herding index h under a binomial null hypothesis is smaller than 5%, we infer that herding is present. In our analysis, we fix the time horizon at one day and determine the number of days on which herding was present for different groups of investors. The results are given in Table II.
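The herding test can be sketched as a one-sided binomial tail computation. This is only an illustration under stated assumptions: the null hypothesis is taken to be that each active investor buys with probability p = 0.5, whereas the paper may calibrate this probability differently, and the function names are hypothetical.

```python
from math import comb


def herding_index(n_buy, n_sell):
    """Herding index h = N+ / (N+ + N-), as in Eq. (14)."""
    return n_buy / (n_buy + n_sell)


def binomial_tails(n_buy, n_sell, p=0.5):
    """Tail probabilities of seeing at least / at most n_buy buyers under the null."""
    n = n_buy + n_sell
    pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    p_at_least = sum(pmf[n_buy:])        # P(X >= n_buy): evidence for buy herding
    p_at_most = sum(pmf[:n_buy + 1])     # P(X <= n_buy): evidence for sell herding
    return p_at_least, p_at_most


def detect_herding(n_buy, n_sell, alpha=0.05):
    """Return 'buy', 'sell' or None depending on which tail falls below alpha."""
    p_at_least, p_at_most = binomial_tails(n_buy, n_sell)
    if p_at_least < alpha:
        return "buy"
    if p_at_most < alpha:
        return "sell"
    return None


# Made-up daily counts for one group of investors.
for n_buy, n_sell in [(9, 1), (1, 9), (5, 5)]:
    print(n_buy, n_sell, herding_index(n_buy, n_sell), detect_herding(n_buy, n_sell))
```

Counting the days on which `detect_herding` returns "buy" or "sell" for a given group would produce entries of the kind reported as buy and sell herding days.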

According to Table II, no buy or sell herding days are observed for trending institutions. For trending individuals and reversing institutions, herding is observed in only one stock on very few days. For uncategorized investors, we see slightly more herding days in a few stocks. For reversing investors, the number of herding days is greater than for other investors and we observe comparable numbers of buy and sell herding days. Our findings are consistent with those for the Spanish stock market, especially in the sense that reversing investors are more likely to herd [5]. Our analysis also allows us to conclude that individuals were more likely to herd than institutions in 2003.



V. SUMMARY 

In summary, we have studied the dynamics of investors' inventory variations. Our data set contains 15 stocks actively traded on the Shenzhen Stock Exchange in 2003, and the investors can be identified as either individuals or institutions.

We studied the cross-correlation matrix $C_{ij}$ of inventory variations of the most active individual and institutional investors. It is found that the distribution of the cross-correlation coefficient $C_{ij}$ is asymmetric and has a power-law form in the bulk and exponential tails. The inventory variations exhibit stronger correlation when both investors are individuals or both are institutions, which indicates that trading behaviors are more similar among investors of the same type. The eigenvalue spectrum shows that the largest and the second largest eigenvalues of the correlation matrix cannot be explained by random matrix theory and that the components of the first eigenvector $u(\lambda_1)$ carry information about stock price fluctuations. In this respect, the behaviors differ for individual and institutional investors.

Based on the contemporaneous cross-correlation coefficients $C_{VR}$ between inventory variations and stock returns, we classified investors into three categories: trending investors who buy (sell) when the stock price rises (falls), reversing investors who sell (buy) when the stock price rises (falls), and uncategorized investors. We also observed that stock returns predict inventory variations. It is interesting to find that about 56% of individuals hold trending or reversing strategies, whereas only 18% of institutions hold such strategies. Moreover, there are far more reversing individuals (50%) than trending individuals (6%). In contrast, there are slightly more trending institutions (10%) than reversing institutions (8%). Hence, Chinese individual investors are prone to selling winning stocks and buying losing stocks, which provides supporting evidence that trading is hazardous to the wealth of individuals